Search for: All records
Total Resources: 6
-
Wang, Yaxuan; Liu, Quan; Liu, Chris Yuhao; Pang, Jinlong; Wei, Wei; Bao, Yujia; Liu, Yang (ICML 2025 Workshop on Machine Unlearning for Generative AI)
Free, publicly-accessible full text available July 13, 2026
-
Pang, Jinlong; Wei, Jiaheng; Shah, Ankit; Zhu, Zhaowei; Wang, Yaxuan; Qian, Chen; Liu, Yang; Bao, Yujia; Wei, Wei (The Thirteenth International Conference on Learning Representations)
Instruction tuning is critical for adapting large language models (LLMs) to downstream tasks, and recent studies have demonstrated that small amounts of human-curated data can outperform larger datasets, challenging traditional data scaling laws. While LLM-based data quality rating systems offer a cost-effective alternative to human annotation, they often suffer from inaccuracies and biases, even in powerful models like GPT-4. In this work, we introduce DS2, a Diversity-aware Score curation method for Data Selection. By systematically modeling error patterns through a score transition matrix, DS2 corrects LLM-based scores and promotes diversity in the selected data samples. Our approach shows that a curated subset (just 3.3% of the original dataset) outperforms full-scale datasets (300k samples) across various machine-alignment benchmarks, and matches or surpasses human-aligned datasets such as LIMA with the same sample size (1k samples). These findings challenge conventional data scaling assumptions, highlighting that redundant, low-quality samples can degrade performance and reaffirming that "more can be less."
Free, publicly-accessible full text available April 24, 2026
-
Wang, Yaxuan; Wei, Jiaheng; Liu, Chris Yuhao; Pang, Jinlong; Liu, Quan; Shah, Ankit; Bao, Yujia; Liu, Yang; Wei, Wei (Thirteenth International Conference on Learning Representations)
Free, publicly-accessible full text available April 24, 2026
